Abstract

Background: B-cell non-Hodgkin lymphoma represents a diverse group of hematological malignancies, of which
follicular lymphoma (FL) is one of the most common subtypes. Family and epidemiological studies suggest an
important genetic role in the etiology of FL. In recent genome-wide association studies (GWAS) of FL, several
genetic susceptibility loci have been identified on chromosome 6p21.33 (rs6457327) and 6p21.32 (rs10484561,
rs2647012) in the human leukocyte antigen class I and class II regions. To identify new genetic variants and further
elucidate the genetic basis of FL, a meta-analysis was performed of the top 1000 SNPs associated with FL risk from
two GWAS in the US, Denmark and Sweden (592 cases, 1541 controls), with independent validation in 107 cases
and 681 controls.

Results: rs9275517 and rs3117222 in the HLA class II region were validated and inversely associated with FL risk
(rs9275517: OR = 0.63, 95% CI = 0.55-0.73, p = 4.03 x 10^-11; rs3117222: OR = 0.66, 95% CI = 0.57-0.77, p = 1.45 x 10^-7).
rs9275517, which is in high linkage disequilibrium with rs2647012 (r^2 = 0.9), was no longer associated with FL after
conditioning on rs2647012. The rs3117222 association was independent of established FL SNPs, but not of the
HLA-DPB1*0301 allele. Using publicly available gene expression profiles with matching genotype information, we
found that rs3117222 also was significantly correlated with increased HLA-DPB1 expression.

Conclusions: By performing a meta-analysis of two GWAS of FL, we further validated the relevance of
HLA-DPB1*0301 as a protective allele in the pathogenesis of FL. Moreover, the protective rs3117222 A allele correlated
with increased levels of HLA-DPB1, suggesting a possible disease mechanism involving HLA-DPB1 expression
regulation. Our results add further support to the major role of HLA genetic variation in the pathogenesis of FL.

Keywords: Follicular lymphoma (FL), Genome-wide association studies (GWAS), Human leukocyte antigen (HLA), 
Meta-analysis 



Background

Follicular lymphoma (FL) is a B-cell neoplasm that
represents the second most common form of B-cell
non-Hodgkin lymphoma (NHL). An important role for
inherited genetic susceptibility for FL has been supported by
recent genome-wide association studies (GWAS) where
three independent susceptibility alleles have been identified
in the HLA class I (rs6457327) [1] and class II (rs10484561,
rs2647012) regions [2,3]. Follow-up HLA sequencing studies
revealed that rs10484561 is in complete linkage
disequilibrium (LD) with the DRB1*0101-DQA1*0101-DQB1*0501
extended haplotype, and rs2647012 is highly
correlated with DRB1*15-DQA1*01-DQB1*06 [4]. An
independent inverse association with FL risk also was
found for DPB1*0301 [4]. Because previous GWAS
only attempted to validate the top 40 variants
associated with FL [2,3], here we conducted a meta-analysis of
the top 1000 SNPs from existing GWAS data in 592 FL
cases and 1541 controls from Denmark/Sweden (SCALE)
and the San Francisco Bay Area (SF-NHL2) to identify
new genetic variants and further elucidate the genetic
basis of FL. Validation genotyping of associated SNPs
was conducted in 107 FL cases and 681 controls from
gene expression levels also was investigated using publicly
available microarray data. 

Results and discussion 

After excluding the SNPs previously tested for validation 
in the two GWAS [2,3], 62 SNPs located in 20 independent
loci were associated with FL at a p-value
threshold of 1x10"^ in the random-effects meta-analysis 
(Additional file 1: Table S1). For each independent locus,
we selected the SNP with the lowest p-value and removed
markers in LD. Of the remaining SNPs, only those with
evidence of association from at least a secondary marker
in the GWAS were taken forward to validation. Among
the 11 independent SNPs selected for validation in the
SF-NHL1 study, two SNPs, rs9275517 and rs3117222, located
on 6p21.32 in the HLA class II region, were validated in
this third independent population (rs9275517: OR = 0.58,
95% CI: 0.37-0.92, FDR-adjusted p-value = 2.29 x 10^-2;
rs3117222: OR = 0.46, 95% CI: 0.29-0.73, FDR-adjusted
p-value = 2.83 x 10^-2; Additional file 2: Table S2). In the
combined analysis of all three studies, rs9275517
(random-effects p-value = 4.03 x 10^-11; p-heterogeneity = 0.76,
I^2 = 0%) and rs3117222 (random-effects p-value = 1.45 x 10^-7;
p-heterogeneity = 0.77, I^2 = 0%) were associated with FL at a
genome-wide significant level (i.e., P < 5.0 x 10^-7;
Table 1, Additional file 3: Table S3).

To determine whether rs9275517 and rs3117222 are
independently associated with FL risk, we adjusted our
models for the previously identified FL-risk loci, rs10484561,
rs2647012, and rs6457327 [1-3]. The results showed that
rs9275517 was no longer associated with FL after
conditioning on rs2647012 in any of the studies (p-value SF =
0.31, p-value SCALE = 0.63; Additional file 4: Table S4).
Further, rs9275517 and rs2647012 are in high LD (r^2 = 0.9
in HapMap-CEU), also suggesting that the signal
observed for rs9275517 is not independent of the previously
validated protective rs2647012 allele [3]. In contrast,
rs3117222 maintained independent statistical significance
when each SNP was fitted in the logistic regression model
(p < 0.02; Additional file 4: Table S4). Because rs3117222
maps 6 kb downstream of the HLA-DPB1 gene, we also
tested the independence of rs3117222 with HLA-DPB1
alleles including DPB1*0301, an allele inversely
associated with FL based on HLA sequencing studies of the
SF-NHL2 study population [4]. Haplotype analysis in the
SF-NHL2 data showed a significant difference in the
frequencies of the HLA-DPB1*0301-rs3117222 haplotype in
cases versus controls (p-value = 9.00 x 10'^ Additional file 
5: Table S5) and the LD analysis suggested that rs3117222
was in LD with HLA-DPB1*0301 (r^2 = 0.24, D' = 0.96).
When DPB1*0301 was included in the model, rs3117222
was no longer associated with FL (p-value = 0.33),
indicating that although rs3117222 is independent of
previously reported FL-associated SNPs, its influence on FL risk
cannot be delineated from the protective HLA-DPB1*0301
allelic association. Nonetheless, through rs3117222, this
study provides the first validation of the HLA-DPB1 locus
as protective in the pathogenesis of FL in two additional
independent studies (SCALE and SF-NHL1).

To explore whether rs3117222 affects HLA-DPB1 gene
expression, we used two publicly available mRNA
expression datasets from the MuTHER [5] and GenCord
[6] projects. We found a strong correlation between the
protective rs3117222 A allele and increased HLA-DPB1
expression in all lymphoblastoid cell lines in both
datasets (Figures 1 and 2), indicating that enhanced
HLA-DPB1 expression may play a protective role in the etiology
of FL. This hypothesis is in line with previous studies
where reduced HLA class II expression on Hodgkin
Reed-Sternberg cells and diffuse large B-cell lymphoma
(DLBCL) tumor cells has been associated with poor
survival for classical HL [7] and DLBCL [8], respectively.
Based on a recent HLA sequencing study of FL [4], we
showed that the protective versus deleterious HLA-DPB1
alleles possess oppositely charged glutamic acid rather
than lysine residues at position 69 in binding pocket 4,
factors that may influence peptide affinities for
HLA-DPB1 [9]. The present study suggests that an additional
mechanism involving effects on HLA-DPB1 expression
may also influence FL risk. Further studies will be needed
to confirm and clarify the specific mechanisms through
which reduced HLA-DPB1 expression may contribute
to deregulated cellular processes that drive FL and its
progression.

Conclusions 

In the present study, we found that rs3117222 was
inversely associated with FL risk independent of previously
published FL SNPs, but not independent of the
HLA-DPB1*0301 allele. Elevated levels of HLA-DPB1
transcripts found in the presence of the protective rs3117222 A
allele suggest that changes in HLA-DPB1 expression may
be involved in the etiology of FL. These findings
emphasize the important role of HLA genetic variation in FL
disease etiology and the need for further studies to clarify
the mechanisms through which HLA class II expression
contributes to FL pathogenesis.

Methods 

Description of GWAS subjects, genotyping and statistical analyses

SF-NHL2 GWAS 

Full details of the study design and methods, genotyping, 
quality control and statistical analyses have been described 
elsewhere [2]. A total of 312,768 markers genotyped in 
213 FL cases and 750 controls passed our quality control 
criteria and were used for genome-wide association
analysis. Population stratification and cryptic relatedness were



Table 1 Summary results for the validated follicular lymphoma associated single-nucleotide polymorphisms

SNP        Gene(s) within 5 kb (hg18)  LOC (hg18)     A1/A2
rs9275517  -                           chr6 32782627  A/G
rs3117222  HLA-DPB1, Q30181            chr6 33168927  A/G

SNP        Study               Logistic p-value  Logistic OR (95% CI)  MAF ca/co
rs9275517  SF-NHL2 GWAS        7.56E-04          0.67 (0.53-0.85)      0.29/0.38
rs9275517  SCALE GWAS          6.73E-07          0.63 (0.53-0.76)      0.37/0.48
rs9275517  SF-NHL1 Validation  2.32E-03          0.57 (0.39-0.82)      0.29/0.41
rs3117222  SF-NHL2 GWAS        4.17E-03          0.67 (0.51-0.88)      0.18/0.25
rs3117222  SCALE GWAS          3.06E-04          0.68 (0.55-0.84)      0.19/0.26
rs3117222  SF-NHL1 Validation  5.72E-03          0.58 (0.39-0.85)      0.16/0.25

Combined meta-analysis
SNP        P-value*  OR* (95% CI)      Q       I^2
rs9275517  4.03E-11  0.63 (0.55-0.73)  0.7596  0% (0%-62.2%)
rs3117222  1.45E-07  0.66 (0.57-0.77)  0.7673  0% (0%-60.7%)

*P-values and ORs estimated from the meta-analysis were identical for the random- and fixed-effects models.
A1/A2 = minor/major alleles; Q = Cochran's Q statistic; I^2 = heterogeneity index.
Figure 1 Correlation of rs3117222 genotypes with HLA-DPB1 expression in the MuTHER dataset. Cell lines: F = adipose, L = lymphoblastoid,
S = skin.



tested and corrected for as described previously [2],
resulting in a final inflation factor λ = 1.04. Associations with
FL were tested using a Cochran-Armitage trend test in
PLINK 1.07 [10]. Odds ratios (OR) and 95% confidence
intervals (95% CI) were calculated using the mid-p
method from the epitools package in R [11].
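
As an illustration of the trend test named above, the following is a minimal Python sketch of a Cochran-Armitage test on a 2 x 3 table of genotype counts. It is not the PLINK implementation, whose details may differ; the function name, the additive weights (0, 1, 2) and the genotype counts are assumptions made for the example and are not taken from the study data.

```python
import math


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x k table.

    cases / controls: counts per genotype class (0, 1, 2 copies of the
    minor allele). Returns the 1-df chi-square statistic and its p-value.
    """
    n_cases = sum(cases)
    n_total = n_cases + sum(controls)
    col_totals = [a + b for a, b in zip(cases, controls)]
    p_case = n_cases / n_total

    # Weighted difference between observed and expected case counts.
    u = sum(w * (r - n * p_case) for w, r, n in zip(weights, cases, col_totals))

    # Variance of u under the null hypothesis of no trend.
    s1 = sum(w * n for w, n in zip(weights, col_totals))
    s2 = sum(w * w * n for w, n in zip(weights, col_totals))
    var_u = p_case * (1 - p_case) * (s2 - s1 * s1 / n_total)

    chi2 = u * u / var_u
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts (not study data): the minor allele is
# less frequent among cases, as for a protective variant.
chi2, p = cochran_armitage_trend(cases=(60, 35, 5), controls=(80, 90, 30))
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

With additive weights the test is sensitive to a per-allele dose effect, which matches the additive model used for the logistic regression analyses described later in the Methods.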

SCALE GWAS 

Full details of the study design and methods, genotyping,
quality control and statistical analyses have been
published [3]. The final analysis included 298,680 genetic
variants available for 379 cases and 791 controls. Study
subjects with evidence of cryptic family relationships
or outliers in terms of population stratification were
removed as described previously [3], resulting in a final
inflation factor λ = 1.028. Wald tests, treating minor allele
counts as continuous covariates, were used to test for
association.



Validation study (SF-NHL1) 

Full details of this population-based San Francisco Bay
Area case-control study of NHL (1,591 cases, 2,515
controls) have been published [12,13]. Here, a subset of
HIV-negative, non-Hispanic white individuals (107 FL
cases, 681 controls) was used for validation.

The SF-NHL1 and SF-NHL2 study protocols were
approved by the UCSF ethics review committee (Nos.
10-03698 and 10-00629, respectively), and the SCALE
study was approved by the Ethical Review Board at
Karolinska Institutet, Stockholm, Sweden (Dnr 99-154,
2007/624-32). All study participants provided informed
consent.

Statistical analysis 

Meta-analysis of the SF-NHL2 and SCALE GWAS 

Figure 2 Correlation of rs3117222 genotypes with HLA-DPB1 expression in the GenCord dataset. Cell lines: F = fibroblast, L = lymphoblastoid,
T = T-cell.

We selected for meta-analysis the top 1000 SNPs most
significantly associated with FL in the SF and SCALE
GWAS. Among these SNPs, 13 (rs9275572, rs10484561,
rs6457617, rs2858331, rs3763313, rs7755224, rs1493202,
rs2857106, rs2157051, rs12529049, rs10511017, rs587791
and rs3130617) were present in the top 1000 of both
GWAS. Selected SCALE GWAS SNPs that were not
genotyped or did not pass direct genotyping quality control
in the SF study were imputed using BEAGLE 3.0.3 [14]
with haplotype data from HapMap phase II-CEU as a
reference. Similarly, selected SF GWAS SNPs that were
not genotyped in the SCALE study were imputed from
the 1000 Genomes pilot 1 CEU (August 2009 release) and
the HapMap Phase II release 22 CEU datasets using
IMPUTE v1 [15]. ORs for each individual study were
estimated using unconditional logistic regression under an
additive model in PLINK. Before meta-analysis, genomic
control (GC) was applied to each study by multiplying the
standard error of the effect estimates by the square root of
the study-specific inflation factor (λ = 1.04 in SF, λ = 1.03
in SCALE). ORs and GC-corrected standard errors
were then combined in a meta-analysis under fixed- and
random-effects inverse variance models using the metagen
function from the meta package in R [16]. Heterogeneity
across studies was tested with the Cochran's Q test and
quantified with the I^2 heterogeneity index.
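
The genomic-control and inverse-variance steps described above can be sketched in a few lines of Python. This is an illustrative re-implementation under stated assumptions, not the metagen code: the random-effects step uses the DerSimonian-Laird estimator of between-study variance, the function names are hypothetical, and the standard errors are back-calculated from the rounded ORs and 95% CIs printed in Table 1 for rs3117222, so the output only approximates the published values.

```python
import math

Z_95 = 1.959964  # normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Standard error of log(OR) recovered from a 95% CI on the OR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def gc_correct(se, inflation):
    """Genomic control: scale the standard error by sqrt(lambda)."""
    return se * math.sqrt(inflation)


def meta_analysis(log_ors, ses):
    """Fixed- and random-effects (DerSimonian-Laird) inverse-variance pooling."""
    w = [1 / se ** 2 for se in ses]
    fixed = sum(wi * b for wi, b in zip(w, log_ors)) / sum(w)

    # Cochran's Q and the I^2 heterogeneity index (in percent).
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Between-study variance; zero whenever Q does not exceed its df.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    w_re = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * b for wi, b in zip(w_re, log_ors)) / sum(w_re)
    pooled_se = math.sqrt(1 / sum(w_re))
    p_value = math.erfc(abs(pooled / pooled_se) / math.sqrt(2))  # two-sided

    return {
        "or_fixed": math.exp(fixed),
        "or_random": math.exp(pooled),
        "ci_low": math.exp(pooled - Z_95 * pooled_se),
        "ci_high": math.exp(pooled + Z_95 * pooled_se),
        "p_value": p_value,
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# OR (lower, upper) for rs3117222 as printed in Table 1:
# SF-NHL2 GWAS, SCALE GWAS, SF-NHL1 validation.
table1_rs3117222 = [(0.67, 0.51, 0.88), (0.68, 0.55, 0.84), (0.58, 0.39, 0.85)]
log_ors = [math.log(o) for o, _, _ in table1_rs3117222]
ses = [se_from_ci(lo, hi) for _, lo, hi in table1_rs3117222]
result = meta_analysis(log_ors, ses)
print({k: round(v, 4) if k != "p_value" else v for k, v in result.items()})
```

Because Q is smaller than its degrees of freedom for these three estimates, the between-study variance is estimated as zero and the random-effects result collapses onto the fixed-effects one, which is consistent with the Table 1 footnote stating that both models gave identical results.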

Validation - SNP selection and analysis 

We selected for validation those SNPs with a random- 
effects p-value threshold <10"^ in the meta-analysis. To 
focus on newly associated loci not previously reported,
the top 40 SNPs associated with FL from each GWAS
were excluded from further validation, as these had been
tested previously [2,3]. For the remaining markers, we
used PLINK to group SNPs in LD based on genotypes
from HapMap-CEU r28, resulting in 20 independent loci
(r^2 < 0.05). Eleven of these SNPs were taken forward to
the validation stage based on 1) lowest meta-analysis
p-value and 2) evidence of a secondary signal at each
locus in the original GWAS. Association analyses were
conducted using trend and logistic regression tests in
PLINK 1.07. P-values were adjusted for multiple
comparisons using the Benjamini-Hochberg FDR correction
from the p.adjust function in R [17] and considered
significant at an FDR-adjusted p-value level of 0.05. ORs and
95% CI were calculated for the variant allele carriers
using the epitools package. To estimate overall
association in the discovery and validation phases, logistic
regression ORs and standard errors for the validated SNPs
were combined with GC-corrected estimates from the
original GWAS in a meta-analysis under fixed- and
random-effects models.
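
For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned above, here is a minimal Python sketch of the step-up procedure, written to follow the same logic as the BH method of R's p.adjust. The function name and the input p-values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four tests (not study data).
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [a <= 0.05 for a in adj]
print(adj, significant)
```

Each adjusted value is at least as large as the corresponding raw p-value, so a SNP can only pass the 0.05 FDR level if its unadjusted p-value is already below 0.05.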

Statistical analysis adjusting for previously associated 
variants 

Logistic regression analysis was conducted adjusting
for the additive effects of FL-associated SNPs that were
entered separately into the model as covariates using a
0,1,2 allele dosage coding. The same approach was used in
the SF-NHL2 dataset to test for independent effects
between SNPs and HLA alleles that were previously
typed in 205 FL cases and 82 controls as part of the SF
GWAS [4]. HLA alleles were coded as binary alleles where
A = present and B = absent.

Linkage disequilibrium (LD) and haplotype analysis 

LD metrics between SNPs were based on European
samples (CEU) from HapMap release 28 [18]. LD between
rs3117222 and HLA-DPB1 alleles was based on
genotype and HLA typing data from the SF GWAS study.
Haplotype analyses were carried out with Haploview
[19].
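
The two LD metrics reported in this study, r^2 and D', measure different things, and a pair of loci can show a high D' together with a modest r^2 when their allele frequencies differ. The Python sketch below computes both from a haplotype frequency and two allele frequencies. It is not the Haploview implementation, and the frequencies used are hypothetical values chosen for illustration, not estimates from the SF-NHL2 data.

```python
def ld_metrics(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: marginal frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies: a SNP allele at 25% and a rarer HLA allele
# at 10% that almost always sits on haplotypes carrying the SNP allele.
d, d_prime, r2 = ld_metrics(p_ab=0.095, p_a=0.25, p_b=0.10)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

In this hypothetical case D' is close to 1 because the rarer allele almost never occurs without the more common one, while r^2 stays low because many haplotypes carry the common allele alone. This is the general pattern behind a result such as D' = 0.96 with r^2 = 0.24.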

Gene expression analysis 

Associations with gene expression were investigated
using two publicly available datasets, one with 166
adipose, 160 skin, and 156 lymphoblastoid cell lines derived
from a subset of healthy female twins of the MuTHER
study [5], and a second dataset with three cell types
(fibroblast, lymphoblastoid and T-cell) derived from
umbilical cords of 75 Geneva GenCord individuals [6].
Correlation between SNPs and gene expression levels was
assessed by the Spearman rank correlation test with
p-values adjusted by permutations using the Genevar
application [20].
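
The kind of analysis described above, a Spearman rank correlation between genotype dosage and expression with a permutation-based p-value, can be sketched as follows. This is an assumption-laden illustration and not Genevar's procedure: the permutation scheme (shuffling expression values across individuals), the number of permutations, the function names and the data are all hypothetical.

```python
import math
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the Spearman correlation."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman_rho(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: copies of the A allele and expression values.
genotypes = [0, 0, 0, 1, 1, 1, 2, 2, 2]
expression = [1.0, 1.2, 0.9, 1.8, 2.1, 1.9, 2.9, 3.2, 3.0]
rho = spearman_rho(genotypes, expression)
p_perm = permutation_p_value(genotypes, expression)
print(f"rho = {rho:.3f}, permutation p = {p_perm:.4f}")
```

A rank-based statistic is a natural choice here because genotype dosage takes only three values and expression measurements from microarrays are often skewed.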

Additional file 



Additional file 1: Table S1. List of single-nucleotide polymorphisms
(SNPs) that were significantly (random effects p-value < 10"^) associated 
with risk of follicular lymphoma (FL) in the meta-analysis of 592 FL cases
and 1541 controls from Denmark/Sweden (SCALE) and the San Francisco
Bay Area (SF-NHL2) studies. Table S2. Results for the 11 single-nucleotide
polymorphisms (SNPs) selected for validation in an independent follicular
lymphoma case-control study from the SF Bay Area (SF-NHL1). Table S3.
Meta-analysis of the combined GWAS and validation datasets for the 11
SNPs selected for validation. Table S4. Logistic regression results for the
validated SNPs in the SF-NHL2 and SCALE GWAS. P-values were
computed in a logistic regression model with and without adjustment
for established FL-associated SNPs. Table S5. Case-control frequencies
and association p-values for the most frequent HLA-DPB1/rs3117222
haplotypes in the SF-NHL2 population.